- Text::ParseWords(3) Text::ParseWords(3)
-
-
-
- NAME
- Text::ParseWords - parse text into an array of tokens or array of arrays
-
- SYNOPSIS
- use Text::ParseWords;
- @lists = &nested_quotewords($delim, $keep, @lines);
- @words = &quotewords($delim, $keep, @lines);
- @words = &shellwords(@lines);
- @words = &parse_line($delim, $keep, $line);
- @words = &old_shellwords(@lines); # DEPRECATED!
-
-
- DESCRIPTION
- The &nested_quotewords() and &quotewords() functions accept a delimiter
- (which can be a regular expression) and a list of lines, and then break
- those lines up into a list of words, ignoring delimiters that appear
- inside quotes. &quotewords() returns all of the tokens in a single long
- list, while &nested_quotewords() returns a list of token lists
- corresponding to the elements of @lines. &parse_line() does tokenizing
- on a single string. The &*quotewords() functions simply call
- &parse_line(), so if you're only splitting one line you can call
- &parse_line() directly and save a function call.
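
The difference between the flat and nested return shapes can be sketched as follows. This is a minimal illustration, not part of the original manual page; the input lines and variable names are made up for the example.

```perl
use strict;
use warnings;
use Text::ParseWords;   # exports quotewords, nested_quotewords, parse_line

# Two hypothetical input lines; the quoted part contains the delimiter.
my @lines = ('alpha "beta gamma"', 'delta epsilon');

# quotewords(): one flat list of tokens across all lines.
my @flat = quotewords('\s+', 0, @lines);

# nested_quotewords(): one array reference of tokens per input line.
my @nested = nested_quotewords('\s+', 0, @lines);

# parse_line(): tokenizes a single string.
my @single = parse_line('\s+', 0, $lines[0]);

print "flat:   ", join(' | ', @flat), "\n";
print "line $_: ", join(' | ', @{ $nested[$_] }), "\n" for 0 .. $#nested;
print "single: ", join(' | ', @single), "\n";
```

Here @flat holds four tokens, while @nested holds two array references, one per element of @lines.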
-
- The $keep argument is a boolean flag. If true, then the tokens are split
- on the specified delimiter, but all other characters (quotes,
- backslashes, etc.) are kept in the tokens. If $keep is false then the
- &*quotewords() functions remove all quotes and backslashes that are not
- themselves backslash-escaped or inside of single quotes (i.e.,
- &quotewords() tries to interpret these characters just like the Bourne
- shell). NB: these semantics are significantly different from the
- original version of this module shipped with Perl 5.000 through 5.004.
- As an additional feature, $keep may be the keyword "delimiters" which
- causes the functions to preserve the delimiters in each string as tokens
- in the token lists, in addition to preserving quote and backslash
- characters.
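
As a rough illustration of the three $keep modes described above, the sketch below runs the same line through &quotewords() with a false value, a true value, and the keyword "delimiters". The sample line is an assumption chosen for the example.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical sample line with one double-quoted phrase.
my $line = q{say "hello world" now};

# $keep false: quotes are removed from the tokens.
my @stripped = quotewords('\s+', 0, $line);

# $keep true: quotes (and backslashes) stay in the tokens.
my @kept = quotewords('\s+', 1, $line);

# $keep eq 'delimiters': as above, and each matched delimiter
# is also returned as a token of its own.
my @with_delims = quotewords('\s+', 'delimiters', $line);

print "stripped:   ", join('', map { "<$_>" } @stripped), "\n";
print "kept:       ", join('', map { "<$_>" } @kept), "\n";
print "delimiters: ", join('', map { "<$_>" } @with_delims), "\n";
```

With this input the first call yields three tokens without quotes, the second keeps the double quotes around "hello world", and the third additionally returns the two single-space delimiters as tokens.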
-
- &shellwords() is written as a special case of &quotewords(), and it does
- token parsing with whitespace as a delimiter, similar to most Unix
- shells.
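
A short sketch of &shellwords() on a shell-like command line follows. The command string is invented for the example; nothing is executed, the string is only split into words.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical command line: one quoted argument and one
# argument containing a backslash-escaped space.
my $command = q{cp "my file.txt" /tmp/backup\ dir};

# shellwords() splits on whitespace and strips quotes and backslashes.
my @argv = shellwords($command);

# The equivalent explicit quotewords() call, for comparison.
my @same = quotewords('\s+', 0, $command);

print "$_: <$argv[$_]>\n" for 0 .. $#argv;
```

Both calls should give the same three words here: the command name, the quoted file name with its space intact, and the escaped path with the backslash removed.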
-
- EXAMPLES
- The sample program:
-
- use Text::ParseWords;
- @words = &quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
- $i = 0;
- foreach (@words) {
- print "$i: <$_>\n";
- $i++;
- }
-
- produces:
-
- 0: <this>
- 1: <is>
- 2: <a test>
- 3: <of quotewords>
- 4: <"for>
- 5: <you>
-
- demonstrating:
-
- 0 a simple word
-
- 1 multiple spaces are skipped because of our $delim
-
- 2 use of quotes to include a space in a word
-
- 3 use of a backslash to include a space in a word
-
- 4 use of a backslash to remove the special meaning of a double-quote
-
- 5 another simple word (note the lack of effect of the backslashed double-quote)
-
- Replacing &quotewords('\s+', 0, q{this is...}) with &shellwords(q{this
- is...}) is a simpler way to accomplish the same thing.
-
- AUTHORS
- Maintainer is Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Original
- author unknown). Much of the code for &parse_line() (including the
- primary regexp) from Joerk Behrends <jbehrends@multimediaproduzenten.de>.
-
- Examples section and other documentation provided by John Heidemann
- <johnh@ISI.EDU>.
-
- Bug reports, patches, and nagging provided by lots of folks-- thanks
- everybody! Special thanks to Michael Schwern <schwern@envirolink.org>
- for assuring me that a &nested_quotewords() would be useful, and to Jeff
- Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about error-
- checking (sort of-- you had to be there).